If you own a website and have given its users the ability to post content, such as comments or articles, you are bound to clash with malicious spammers at some point. They will come when you least expect them.
So, imagine that you wake up one day after posting an amazing article that you’ve put a lot of effort into. You see that people are obviously interested in what you had to say, as it has attracted over a hundred comments. At first, it makes you happy. However, your happiness doesn’t last, as you discover that all of the comments are about Viagra.
The bad news is that this kind of thing is extremely common. The good news, however, is that it is easy to protect your website against spammers. This is why you rarely, if ever, see this kind of comment on reputable websites.
How spam comments can cause damage
It may not be immediately clear how damaging spam comments can be. After all, it’s not the same as breaking into the database, right? Well, there is actually quite a lot of damage spammers can cause.
They don’t tend to just leave random comments here and there. Once spammers find out that a particular page doesn’t block them from posting, they keep posting on that page until they are prevented from doing so. If left unchecked, the page will soon be completely inundated with spam comments and dodgy links, just like in the example above.
If this happens, your search engine ranking will inevitably go down. You may have one of the best and most informative articles out there, but if there is more text on the page about Viagra, Cialis, Tramadol, online casinos and escort services than there is actual useful content, search engine crawlers will struggle to recognize which keywords the page should be matched with.
Likewise, spammers don’t tend to just leave textual comments. They post links to shady sites too. Those addresses are likely to be blacklisted by search engines; therefore, your site may become blacklisted too.
Lastly, having these comments on your pages will not give your users a good impression. You can get an idea of how they will feel by imagining yourself opening a page with a lot of irrelevant content and some dodgy-looking links to go with it.
I certainly wouldn’t trust the owners of a website that contains a lot of suspicious user-generated content. After all, if the website is not protected against the most obvious types of spam, it would be safe to assume that it is not protected against other kinds of attack either, such as cross-site scripting.
My own encounter with spammers
My own website has been targeted by spammers. At first, a large number of comments about Viagra and Cialis appeared on a particular page that criticized the actions of so-called “social justice warriors”.
Initially, I thought of it as political activism. The article happens to be about how over-zealous and ideologically possessed activists use technology to silence the people they don’t agree with. This is something such people wouldn’t like; therefore, I thought that the spam comments that came in such volume were an attempt to silence me. After all, the so-called “social justice warriors” can be pretty vicious.
However, I soon discovered that this was not the case. Another bunch of spammy comments, this time about online casinos and illegal painkillers, started to appear on my article about WebAssembly. It is a technology for running certain types of code in the browser; therefore, nobody would be emotionally attached to it enough to want to push an article about it down the search engine rankings. Unless, of course, they happen to worship in-browser JavaScript and perceive WebAssembly as a threat to it.
This is what typical spammy comments looked like:
After some googling around, I found out that this is just what spammers do. Pages were chosen completely at random and the content on them didn’t matter at all.
Give the spammers a middle finger
Luckily, getting rid of all the spam comments was easy. So was stopping them from writing any new ones.
The easiest way to protect your website from spammers is to install one of the commenting plugins, such as Disqus. Those plugins already use various spam filters, so you wouldn’t have to worry about managing those yourself. If anybody manages to get past the filter and post something inappropriate on your page, you can report that comment to the plugin provider.
However, this solution is not suitable for every scenario. For example, you may want to build your own comment widget and manage all the comments internally. The same applies to websites where user-generated content is more than just comments.
In this case, you can use a free service called Stop Forum Spam. It has its own plugin, but it also comes with a public API that can be queried. It is a large database of domains and IP addresses that belong to known spammers, and it gets updated on a regular basis. All you need to do is send a simple HTTP request with the IP address of the comment author. The response will tell you whether the IP has been reported and, if so, how many times. Based on the content of the response, you can decide whether to allow the comment through or not.
The website has some code samples showing how to call the API from back-end code. However, those are limited to PHP. Nevertheless, the API is so intuitive that it will not be difficult for any half-decent web developer to figure out how to use it from a different language.
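To give an idea of what such a lookup might look like outside PHP, here is a minimal Python sketch. Treat the details as assumptions to verify against the Stop Forum Spam documentation: the endpoint URL, the `json` flag, and the `success`, `ip`, `appears` and `frequency` response fields are written from memory of the public API, and the function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm it in the Stop Forum Spam API documentation.
API_URL = "https://api.stopforumspam.org/api"


def build_query_url(ip, base_url=API_URL):
    """Build the lookup URL for one IP address, asking for a JSON reply."""
    query = urllib.parse.urlencode({"ip": ip})
    # The bare "json" flag is assumed to switch the response format to JSON.
    return f"{base_url}?{query}&json"


def is_reported(response, min_frequency=1):
    """Decide from a decoded response whether the IP counts as a known spammer.

    Assumes a shape like {"success": 1, "ip": {"appears": 1, "frequency": 5}}.
    """
    if not response.get("success"):
        # The lookup failed, so this sketch lets the comment through.
        return False
    record = response.get("ip") or {}
    reported = bool(record.get("appears"))
    return reported and int(record.get("frequency", 0)) >= min_frequency


def check_ip(ip, timeout=5.0):
    """Query the service (network call) and return True if the IP is reported."""
    with urllib.request.urlopen(build_query_url(ip), timeout=timeout) as resp:
        return is_reported(json.load(resp))
```

The sketch allows the comment when the lookup itself fails, so an outage of the service does not block genuine users. Raising `min_frequency` makes the check more lenient towards addresses that have only been reported once or twice.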
Of course, there is a chance that any IP address that you may get spammed from has not yet been reported to Stop Forum Spam. Therefore, it makes sense to add some extra layers of security, such as disallowing HTML markup, looking for specific keywords and examining the message pattern.
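Those extra layers could be sketched roughly as follows. This is an illustration rather than a complete filter: the keyword list, the link limit and the `looks_like_spam` name are all made up for the example, and a real site would tune them to its own traffic.

```python
import re

# Illustrative keyword list only; adjust it to the spam you actually receive.
BLOCKED_KEYWORDS = {"viagra", "cialis", "tramadol", "casino"}

# Matches something shaped like an HTML tag, e.g. <a href="..."> or </script>.
HTML_TAG = re.compile(r"</?[a-zA-Z][^>]*>")

# Counts links, since a comment stuffed with URLs is a common spam pattern.
URL = re.compile(r"https?://", re.IGNORECASE)


def looks_like_spam(text, max_links=2):
    """Return True if the comment trips any of the simple content checks."""
    if HTML_TAG.search(text):
        return True  # HTML markup is not allowed in comments
    lowered = text.lower()
    if any(word in lowered for word in BLOCKED_KEYWORDS):
        return True  # contains a blocked keyword
    return len(URL.findall(text)) > max_links  # too many links
```

A comment would then be accepted only if it passes both the IP lookup and these content checks.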
However, in my own usage so far, the API has been able to stop all incoming spam based on the IP address alone. I still saw many spam attempts after implementing the check on my server, and not a single one of them was successful.
It goes without saying that, although you shouldn’t let spammy messages be posted on your web pages, you should log all of them nonetheless. When you do, it is good practice to record as many details as possible. You never know when this data may come in handy.
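As a rough illustration of that kind of logging, the sketch below writes each rejected attempt as one JSON line using Python's standard `logging` module. The field names, the `spam_attempts` logger name and both function names are hypothetical; record whatever your own application has available.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; route it to a file or log service as you see fit.
spam_log = logging.getLogger("spam_attempts")


def build_spam_record(ip, user_agent, page, body, reason):
    """Collect the details of a rejected comment into a plain dictionary."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "ip": ip,
        "user_agent": user_agent,
        "page": page,
        "body": body,
        "reason": reason,  # e.g. "ip_reported" or "blocked_keyword"
    }


def log_spam_attempt(ip, user_agent, page, body, reason):
    """Log a rejected comment as a single JSON line and return the record."""
    record = build_spam_record(ip, user_agent, page, body, reason)
    spam_log.warning(json.dumps(record))
    return record
```

Keeping one JSON object per line makes it easy to search the log later, for example to see which pages are targeted most often.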